How to avoid spurious cluster validation? A methodological investigation on simulated and fMRI data.
Authors
Abstract
This paper evaluates a common approach that has been considered a promising option for exploratory fMRI data analysis. The approach has two stages: creating from the data a sequence of partitions with an increasing number of subsets (clustering), and selecting the one partition in this sequence that shows the clearest indication of an existing structure (cluster validation). To ensure that the selected partition is actually the best characterization of the data structure, previous studies sought the most appropriate validity function(s). In our analysis protocol, we first optimize the sequence of partitions according to the given objective function. Our study showed that insufficient optimization of the partition, for one or more numbers of clusters, can easily yield a spurious validation result, which in turn may lead the analyst to a misleading interpretation of the fMRI experiment. Sufficient optimization for each included number of clusters, however, provided the basis for a reliable, adequate characterization of the data. Furthermore, it enabled an adequate evaluation of the validity functions. These findings were obtained independently for three clustering algorithms (representing the hard and fuzzy clustering variants) and three up-to-date cluster validity functions. They were derived from analyses of Gaussian clusters, simulated data sets that mimic typical fMRI response signals, and real fMRI data. Based on our results, we propose a number of options for configuring improved clustering tools.
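The two-stage procedure described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' protocol: the abstract does not name the three clustering algorithms or the three validity functions, so plain k-means (Lloyd's algorithm) and the Calinski-Harabasz index stand in for them, and random restarts stand in for "sufficient optimization". All function names (`kmeans_once`, `best_of_restarts`, `calinski_harabasz`, `validity_by_k`, `make_blobs`) are hypothetical helpers introduced here.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_once(points, k, rng, n_iter=50):
    """One k-means run from a random start; returns (objective, labels)."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    # Objective: within-cluster sum of squared distances (lower is better).
    objective = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return objective, labels

def best_of_restarts(points, k, n_restarts, seed=0):
    """Optimize the partition for a fixed k by keeping the best of several runs."""
    rng = random.Random(seed)
    runs = (kmeans_once(points, k, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[0])

def calinski_harabasz(points, objective, k):
    """Stand-in validity function (an assumption, not taken from the paper)."""
    n = len(points)
    mean = tuple(sum(c) / n for c in zip(*points))
    total = sum(sq_dist(p, mean) for p in points)
    between = total - objective
    return (between / (k - 1)) / (objective / (n - k))

def validity_by_k(points, ks, n_restarts, seed=0):
    """Stage 1 (clustering) and stage 2 (validation) for each candidate k."""
    scores = {}
    for k in ks:
        objective, _ = best_of_restarts(points, k, n_restarts, seed)
        scores[k] = calinski_harabasz(points, objective, k)
    return scores

def make_blobs(seed=0, n_per_cluster=30):
    """Toy data: three 2-D Gaussian clusters (illustrative values only)."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
            for cx, cy in centers for _ in range(n_per_cluster)]

if __name__ == "__main__":
    pts = make_blobs()
    for restarts in (1, 20):
        scores = validity_by_k(pts, range(2, 6), n_restarts=restarts)
        print(restarts, "restart(s): selected k =", max(scores, key=scores.get))
```

For a fixed number of clusters, this validity score depends on the partition only through its objective value, so a poorly optimized partition at one k lowers that k's score and can shift the selected partition. Comparing the scores obtained with one restart against those obtained with many is a simple way to check whether a validation result is stable with respect to the optimization effort.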
Similar resources
Improving the Performance of ICA Algorithm for fMRI Simulated Data Analysis Using Temporal and Spatial Filters in the Preprocessing Phase
Introduction: The accuracy of functional MRI (fMRI) data analysis usually decreases in the presence of noise and artifact sources. A common solution for analyzing fMRI data with high noise is to use suitable preprocessing methods to denoise the data. Some effects of preprocessing methods on the parametric methods such as general linear model (GLM) have previously been evalua...
Feature Selection Based on Genetic Algorithm in the Diagnosis of Autism Disorder by fMRI
Background: Autism Spectrum Disorder (ASD) occurs based on the continuous deficit in a person’s verbal skills, visual, auditory, touch, and social behavior. Over the last two decades, one of the most important approaches in studying brain functions in autistic persons is using functional Magnetic Resonance Imaging (fMRI). Objectives: It is common to use all brain regions in functional extracti...
A Comparative Study of Class Activities and Students’ Expectations of IELTS and TOEFL iBT Preparation Courses: A Methodological Triangulation Washback Study
Washback refers to the influence of a test on teaching and learning. This study was an attempt to compare the influence of IELTS and TOEFL iBT on the expectations the students brought to their courses and to investigate how these expectations were fulfilled. To this end, 100 IELTS and 120 TOEFL iBT students attending preparation courses took a questionnaire survey, and a sample of their ten cla...
Spurious Hyperleukocytosis
Hyperleukocytosis is an oncological emergency but is extremely rare in non-malignant conditions. Nucleated RBCs give rise to spuriously high total leucocyte count and cause clinical dilemma. Thalassemia major patients are known to have leucocytosis even after correction for nucleated RBCs. We report a case of undiagnosed Thalassemia major in a 4 month old infant with total leucocyte count highe...
Feature selection using genetic algorithm for classification of schizophrenia using fMRI data
In this paper we propose a new method for classification of subjects into schizophrenia and control groups using functional magnetic resonance imaging (fMRI) data. In the preprocessing step, the number of fMRI time points is reduced using principal component analysis (PCA). Then, independent component analysis (ICA) is used for further data analysis. It estimates independent components (ICs) of...
Journal:
- NeuroImage
Volume 17, Issue 1
Pages: -
Publication date: 2002